
    Modeling the successes and failures of content-based platforms

    Online platforms, such as Quora, Reddit, and Stack Exchange, provide substantial value to society through their original content. Content from these platforms informs many spheres of life, including software development, finance, and academic research. Motivated by the powerful applications of their content, we refer to these platforms as content-based platforms and study their successes and failures. The most common avenue for studying online platforms' successes and failures is to examine user growth. However, growth can be misleading. While many platforms initially attract a massive user base, a large fraction later exhibit post-growth failures. For example, despite their enormous growth, content-based platforms such as Stack Exchange and Reddit have struggled to retain users and generate high-quality content. Motivated by these post-growth failures, we ask: when are content-based platforms sustainable? This thesis aims to develop explanatory models that shed light on the long-term successes and failures of content-based platforms. To this end, we conduct a series of large-scale empirical studies using explanatory and causal models. In the first study, we analyze the community question answering websites on the Stack Exchange platform through the economic lens of a "market". We discover a curious phenomenon: on many Stack Exchange sites, platform success measures, such as the percentage of answered questions, decline as the number of users increases. In the second study, we identify the causal factors that contribute to this decline. Specifically, we show that impression signals, such as the contributing user's reputation, the aggregate vote thus far, and the position of content, significantly affect the votes on content on Stack Exchange sites. These unintended effects are known as voter biases, which in turn affect the future participation of users.
    In the third study, we develop a methodology for reasoning about alternative voting norms, specifically how they impact user retention. We show that if Stack Exchange community members had voted based on content-based criteria, such as length, readability, objectivity, and polarity, the platform would have attained higher user retention. In the fourth study, we examine the effect of user roles on the health of content-based platforms. We reveal that the composition of Stack Exchange communities (based on user roles) varies across topical categories. Further, these communities exhibit statistically significant differences in health metrics. Altogether, this thesis offers fresh insights into the successes and failures of content-based platforms.
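    The content-based criteria named in the third study can be made concrete with a small sketch. The feature definitions below (word count as length, average sentence length as a crude readability proxy) are illustrative assumptions; the abstract does not specify the exact features or how they were computed.

    ```python
    import re

    def word_count(text):
        """Length criterion: number of word tokens."""
        return len(re.findall(r"[A-Za-z']+", text))

    def avg_sentence_length(text):
        """Crude readability proxy: mean words per sentence."""
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        return word_count(text) / max(len(sentences), 1)

    answer = "Use a dict. Lookups are O(1) on average, which keeps it fast."
    features = {
        "length": word_count(answer),
        "avg_sentence_length": avg_sentence_length(answer),
    }
    ```

    A counterfactual voting norm would then rank answers by a score over such features rather than by raw user votes.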

    The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale

    In this paper, we interpret the community question answering websites on the StackExchange platform as knowledge markets, and analyze how and why these markets can fail at scale. A knowledge market framing allows site operators to reason about market failures and to design policies to prevent them. Our goal is to provide insights into large-scale knowledge market failures through an interpretable model. We explore a set of interpretable economic production models on a large empirical dataset to analyze the dynamics of content generation in knowledge markets. Among these, the Cobb-Douglas model best explains the empirical data and provides an intuitive account of content generation through the concepts of elasticity and diminishing returns. Content generation depends on user participation and on how specific types of content (e.g., answers) depend on other types (e.g., questions). We show that these factors of content generation have constant elasticity: a percentage increase in any of the inputs leads to a constant percentage increase in the output. Furthermore, markets exhibit diminishing returns: the marginal output decreases as an input is incrementally increased. Knowledge markets also vary in their returns to scale: the increase in output resulting from a proportionate increase in all inputs. Importantly, many knowledge markets exhibit diseconomies of scale: measures of market health (e.g., the percentage of questions with an accepted answer) decrease as a function of the number of participants. The implications of our work are two-fold: site operators ought to design incentives as a function of system size (number of participants), and the market lens can shed light on complex dependencies among different content types and participant actions in general social networks.
    Comment: The 27th International Conference on World Wide Web (WWW), 2018
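    The Cobb-Douglas framing can be illustrated with a small numeric sketch. The coefficients below are invented for illustration and are not the fitted values from the paper.

    ```python
    def cobb_douglas(askers, answerers, A=1.0, alpha=0.4, beta=0.5):
        """Output (e.g., answers produced) = A * askers^alpha * answerers^beta."""
        return A * askers**alpha * answerers**beta

    base = cobb_douglas(1000, 500)

    # Constant elasticity: a 1% increase in answerers yields roughly a
    # beta% (here 0.5%) increase in output, regardless of the base level.
    bumped = cobb_douglas(1000, 505)
    elasticity = (bumped / base - 1) / 0.01

    # Decreasing returns to scale when alpha + beta < 1: doubling both
    # inputs less than doubles the output (factor 2^0.9, just under 2).
    scale_factor = cobb_douglas(2000, 1000) / base
    ```

    Note that decreasing returns to scale in this production sense is distinct from the paper's diseconomies of scale, which concern health metrics declining as participation grows.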

    A Generative Model for Discovering Action-Based Roles and Community Role Compositions on Community Question Answering Platforms

    This paper proposes a generative model for discovering user roles and community role compositions on Community Question Answering (CQA) platforms. While past research shows that participants play different roles in online communities, automatically discovering these roles and providing a readily interpretable summary of user behavior remains an important challenge. Furthermore, there has been relatively little insight into the distribution of these roles across communities. Does a community's composition over user roles vary as a function of topic? How does it relate to the health of the underlying community? Does role composition evolve over time? The generative model proposed in this paper, the mixture of Dirichlet-multinomial mixtures (MDMM) behavior model, can (1) automatically discover interpretable user roles (as probability distributions over atomic actions) directly from log data, and (2) uncover community-level role compositions to facilitate such cross-community studies. A comprehensive experiment on all 161 non-meta communities on the StackExchange CQA platform demonstrates that our model can be useful for a wide variety of behavioral studies, and we highlight three empirical insights. First, we show interesting distinctions in question-asking behavior on StackExchange (where two distinct types of askers can be identified) and answering behavior (where two distinct roles surrounding answers emerge). Second, we find statistically significant differences in behavior compositions across topical groups of communities on StackExchange, and that those groups with statistically significant differences in health metrics also have statistically significant differences in behavior compositions, suggesting a relationship between behavior composition and health. Finally, we show that the MDMM behavior model can be used to demonstrate similar but distinct evolutionary patterns across topical groups.
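    The generative process can be sketched as follows. This is an illustrative simplification in the spirit of a mixture of Dirichlet-multinomial mixtures, with invented sizes, action names, and hyperparameters; it is not the paper's exact model specification or inference procedure.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Roles are probability distributions over atomic actions
    # (e.g., ask, answer, comment, edit, vote, moderate).
    n_roles, n_actions = 4, 6
    role_action_probs = rng.dirichlet(np.ones(n_actions), size=n_roles)

    def sample_community(n_users=100, actions_per_user=20, alpha=1.0):
        """Sample one community's user-action counts under the sketch model."""
        # Community-level role composition drawn from a Dirichlet prior.
        role_mix = rng.dirichlet(alpha * np.ones(n_roles))
        counts = np.zeros((n_users, n_actions), dtype=int)
        for u in range(n_users):
            role = rng.choice(n_roles, p=role_mix)   # latent role per user
            counts[u] = rng.multinomial(actions_per_user,
                                        role_action_probs[role])
        return role_mix, counts

    role_mix, counts = sample_community()
    ```

    Inference would run in the opposite direction, recovering the role distributions and per-community mixes from observed action counts; comparing the inferred `role_mix` vectors across communities is what enables the cross-community studies described above.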